Linear Convergence of Variance-Reduced Stochastic Gradient without Strong Convexity

Authors

  • Pinghua Gong
  • Jieping Ye
Abstract

Stochastic gradient algorithms estimate the gradient based on only one or a few samples and enjoy low computational cost per iteration. They have been widely used in large-scale optimization problems. However, stochastic gradient algorithms are usually slow to converge and achieve sub-linear convergence rates, due to the inherent variance in the gradient computation. To accelerate the convergence, some variance-reduced stochastic gradient algorithms, e.g., the proximal stochastic variance-reduced gradient (Prox-SVRG) algorithm, have recently been proposed to solve strongly convex problems. Under the strong convexity condition, these variance-reduced stochastic gradient algorithms achieve a linear convergence rate. However, many machine learning problems are convex but not strongly convex. In this paper, we introduce Prox-SVRG and its projected variant, called Variance-Reduced Projected Stochastic Gradient (VRPSG), to solve a class of non-strongly convex optimization problems widely used in machine learning. As the main technical contribution of this paper, we show that both VRPSG and Prox-SVRG achieve a linear convergence rate without strong convexity. A key ingredient in our proof is a Semi-Strongly Convex (SSC) inequality, which is the first to be rigorously proved for a class of non-strongly convex problems in both constrained and regularized settings. Moreover, the SSC inequality is independent of algorithms and may be applied to analyze other stochastic gradient algorithms besides VRPSG and Prox-SVRG, which may be of independent interest. To the best of our knowledge, this is the first work that establishes a linear convergence rate for variance-reduced stochastic gradient algorithms on both constrained and regularized problems without strong convexity.
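The variance-reduction scheme the abstract describes can be sketched in a few lines. The following is a minimal illustration of an SVRG-style update with an optional projection step (as in the VRPSG variant), not the paper's actual implementation; the function names and parameters are our own:

```python
import numpy as np

def svrg(grad_i, x0, n, step, n_epochs=20, m=None, proj=None, rng=None):
    """Variance-reduced (projected) stochastic gradient sketch.

    grad_i(x, i) returns the gradient of the i-th component function f_i at x.
    proj, if given, projects onto the feasible set (the projected variant);
    otherwise this is plain SVRG on the smooth part.
    """
    rng = np.random.default_rng() if rng is None else rng
    m = n if m is None else m              # inner-loop length
    x_tilde = np.asarray(x0, dtype=float)  # snapshot point
    for _ in range(n_epochs):
        # Full gradient at the snapshot (computed once per epoch).
        full_grad = np.mean([grad_i(x_tilde, i) for i in range(n)], axis=0)
        x = x_tilde.copy()
        for _ in range(m):
            i = rng.integers(n)
            # Variance-reduced estimate: unbiased, and its variance shrinks
            # as both x and x_tilde approach the optimal set.
            g = grad_i(x, i) - grad_i(x_tilde, i) + full_grad
            x = x - step * g
            if proj is not None:
                x = proj(x)
        x_tilde = x                        # refresh the snapshot
    return x_tilde
```

Because the correction term drives the gradient variance to zero near the solution, a constant step size suffices, which is what makes the linear rate attainable; plain SGD must decay its step size and is stuck at a sub-linear rate.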


Related articles

Linear Convergence of Gradient and Proximal-Gradient Methods Under the Polyak-Łojasiewicz Condition

In 1963, Polyak proposed a simple condition that is sufficient to show a global linear convergence rate for gradient descent. This condition is a special case of the Łojasiewicz inequality proposed in the same year, and it does not require strong convexity (or even convexity). In this work, we show that this much-older Polyak-Łojasiewicz (PL) inequality is actually weaker than the main condition...
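For reference, the PL inequality discussed in this excerpt is usually stated as follows (standard form; here $\mu > 0$ and $f^*$ denotes the optimal value):

```latex
\frac{1}{2}\,\|\nabla f(x)\|^2 \;\ge\; \mu \left( f(x) - f^* \right) \qquad \text{for all } x .
```

Strong convexity with parameter $\mu$ implies the PL inequality with the same $\mu$, but the converse fails: for instance, $f(x) = x^2 + 3\sin^2 x$ satisfies PL while being non-convex.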


Linear Convergence of Proximal-Gradient Methods under the Polyak-Łojasiewicz Condition

In 1963, Polyak proposed a simple condition that is sufficient to show that gradient descent has a global linear convergence rate. This condition is a special case of the Łojasiewicz inequality proposed in the same year, and it does not require strong convexity (or even convexity). In this work, we show that this much-older Polyak-Łojasiewicz (PL) inequality is actually weaker than the four mai...


Projected Semi-Stochastic Gradient Descent Method with Mini-Batch Scheme under Weak Strong Convexity Assumption

We propose a projected semi-stochastic gradient descent method with mini-batch for improving both the theoretical complexity and practical performance of the general stochastic gradient descent method (SGD). We are able to prove linear convergence under a weak strong convexity assumption. This requires no strong convexity assumption for minimizing the sum of smooth convex functions subject to a c...


Adaptive SVRG Methods under Error Bound Conditions with Unknown Growth Parameter

Error bound, an inherent property of an optimization problem, has recently revived in the development of algorithms with improved global convergence without strong convexity. The most studied error bound is the quadratic error bound, which generalizes strong convexity and is satisfied by a large family of machine learning problems. Quadratic error bounds have been leveraged to achieve linear con...
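The quadratic error bound referred to in this excerpt is commonly written as follows (standard form; here $\lambda > 0$, $F^*$ is the optimal value, and $\mathcal{X}^*$ the optimal set):

```latex
F(x) - F^* \;\ge\; \frac{\lambda}{2}\,\operatorname{dist}\!\left(x, \mathcal{X}^*\right)^2 \qquad \text{for all } x ,
```

which $\lambda$-strong convexity implies but which can also hold without convexity, hence "generalizes strong convexity" above.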


Stochastic Variance Reduction for Nonconvex Optimization

We study nonconvex finite-sum problems and analyze stochastic variance reduced gradient (SVRG) methods for them. SVRG and related methods have recently surged into prominence for convex optimization given their edge over stochastic gradient descent (SGD); but their theoretical analysis almost exclusively assumes convexity. In contrast, we prove non-asymptotic rates of convergence (to stationary...




Publication date: 2014